NSF PAR Search | NSF Public Access Repository

Simultaneous Detection of Structural Breaks and Outliers in Time Series

https://doi.org/10.1111/jtsa.70010

Davis, Richard A; Lee, Thomas C M; Rodriguez‐Yam, Gabriel A (July 2025, Journal of Time Series Analysis)

This article considers the problem of modeling a class of nonstationary time series using piecewise autoregressive (AR) processes in the presence of outliers. The number and locations of the piecewise AR segments, as well as the orders of the respective AR processes, are assumed to be unknown. In addition, each piece may contain an unknown number of innovational and/or additive outliers. The minimum description length (MDL) principle is applied to compare various segmented AR fits to the data. The goal is to find the “best” combination of the number of segments, the lengths of the segments, the orders of the piecewise AR processes, and the number and type of outliers. Such a “best” combination is implicitly defined as the optimizer of an MDL criterion. Since the optimization is carried out over a large number of configurations of segments and positions of outliers, a genetic algorithm is used to find optimal or near‐optimal solutions. Numerical results from simulation experiments and real data analyses show that the procedure enjoys excellent empirical properties.
Free, publicly-accessible full text available July 24, 2026
Insights into Kernel PCA with Application to Multivariate Extremes

https://doi.org/10.1137/24M1678635

Medina, Marco Avella; Davis, Richard A; Samorodnitsky, Gennady (June 2025, SIAM Journal on Mathematics of Data Science)

Free, publicly-accessible full text available June 30, 2026
Spectral learning of multivariate extremes

Avella, Marco; Davis, Richard A; Samorodnitsky, G (April 2024, Journal of Machine Learning Research)

Full Text Available
Gene syntax defines supercoiling-mediated transcriptional feedback

https://doi.org/10.1101/2025.01.19.633652

Johnstone, Christopher P; Love, Kasey S; Kabaria, Sneha R; Jones, Ross D; Blanch-Asensio, Albert; Ploessl, Deon S; Peterman, Emma L; Lee, Rachel; Yun, Jiyoung; Oakes, Conrad G; et al (January 2025, bioRxiv)

Gene syntax, the order and arrangement of genes and their regulatory elements, shapes the dynamic coordination of both natural and synthetic gene circuits. Transcription at one locus profoundly impacts the transcription of adjacent genes, but the molecular basis of this effect remains poorly understood. Here, using integrated reporter circuits in human cells, we show that the reciprocal effects of transcription and DNA supercoiling, which we term supercoiling-mediated feedback, regulate expression of adjacent genes in a syntax-specific manner. Using a suite of chromatin state assays, we measure syntax- and induction-dependent formation of chromatin structures in human induced pluripotent stem cells. Applying syntax as a design parameter and without altering sequence or copy number, we built compact gene circuits, tuning the expression mean, noise, and stoichiometry across diverse delivery methods and cell types. Integrating supercoiling-mediated feedback into models of gene regulation will expand our understanding of native systems and enhance the design of synthetic gene circuits.
Free, publicly-accessible full text available January 19, 2026
COVID-19 cases and deaths in the United States follow Taylor’s law for heavy-tailed distributions with infinite variance

https://doi.org/10.1073/pnas.2209234119

Cohen, Joel E.; Davis, Richard A.; Samorodnitsky, Gennady (September 2022, Proceedings of the National Academy of Sciences)

The spatial and temporal patterns of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) cases and COVID-19 deaths in the United States are poorly understood. We show that variations in the cumulative reported cases and deaths by county, state, and date exemplify Taylor’s law of fluctuation scaling. Specifically, on day 1 of each month from April 2020 through June 2021, each state’s variance (across its counties) of cases is nearly proportional to its squared mean of cases. COVID-19 deaths behave similarly. The lower 99% of counts of cases and deaths across all counties are approximately lognormally distributed. Unexpectedly, the largest 1% of counts are approximately Pareto distributed, with a tail index that implies a finite mean and an infinite variance. We explain why the counts across the entire distribution conform to Taylor’s law with exponent two using models and mathematics. The finding of infinite variance has practical consequences. Local jurisdictions (counties, states, and countries) that are planning for prevention and care of largely unvaccinated populations should anticipate the rare but extremely high counts of cases and deaths that occur in distributions with infinite variance. Jurisdictions should prepare collaborative responses across boundaries, because extremely high local counts of cases and deaths may vary beyond the resources of any local jurisdiction.
Full Text Available
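The Taylor's law relationship described in the COVID-19 abstract above (variance nearly proportional to the squared mean) can be checked with a log-log regression. The following is a minimal sketch, not the authors' procedure: it assumes the law takes the form variance ≈ a · mean^b and estimates b by ordinary least squares on logarithms. The function name `taylor_law_fit` and the grouping of counts (for example, one array of county counts per state) are hypothetical choices made for illustration.

```python
import numpy as np


def taylor_law_fit(groups):
    """Fit log(variance) = log(a) + b * log(mean) across groups.

    `groups` is a sequence of 1-D arrays of positive counts (hypothetical
    layout: one array per state, one entry per county).
    Returns (b, a): the estimated exponent and proportionality constant.
    """
    means = np.array([np.mean(g) for g in groups], dtype=float)
    variances = np.array([np.var(g, ddof=1) for g in groups], dtype=float)
    # Straight-line fit on the log-log scale; the slope is the exponent b.
    slope, intercept = np.polyfit(np.log(means), np.log(variances), 1)
    return slope, np.exp(intercept)
```

An estimated exponent close to 2 would correspond to the "variance proportional to squared mean" pattern reported in the abstract. With infinite-variance tails, as the abstract notes, sample variances are unstable, so a fit like this is only a descriptive check.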
Time series estimation of the dynamic effects of disaster-type shocks

https://doi.org/10.1016/j.jeconom.2022.02.009

Davis, Richard; Ng, Serena (April 2022, Journal of Econometrics)

Full Text Available
Cauchy, normal and correlations versus heavy tails

https://doi.org/10.1016/j.spl.2022.109489

Xu, Hui; Cohen, Joel E.; Davis, Richard A.; Samorodnitsky, Gennady (July 2022, Statistics & Probability Letters)

Full Text Available
Handling missing extremes in tail estimation

https://doi.org/10.1007/s10687-021-00429-z

Xu, Hui; Davis, Richard; Samorodnitsky, Gennady (December 2021, Extremes)

Full Text Available
Clustering multivariate time series using energy distance

https://doi.org/10.1111/jtsa.12688

Davis, Richard A.; Fernandes, Leon; Fokianos, Konstantinos (April 2023, Journal of Time Series Analysis)

A novel methodology is proposed for clustering multivariate time series data using energy distance defined in Székely and Rizzo (2013). Specifically, a dissimilarity matrix is formed using the energy distance statistic to measure the separation between the finite‐dimensional distributions for the component time series. Once the pairwise dissimilarity matrix is calculated, a hierarchical clustering method is then applied to obtain the dendrogram. This procedure is completely nonparametric as the dissimilarities between stationary distributions are directly calculated without making any model assumptions. In order to justify this procedure, asymptotic properties of the energy distance estimates are derived for general stationary and ergodic time series. The method is illustrated in a simulation study for various component time series that are either linear or nonlinear. Finally, the methodology is applied to two examples; one involves the GDP of selected countries and the other is the population size of various states in the U.S.A. in the years 1900–1999.
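The pipeline outlined in this abstract (pairwise energy distances, then hierarchical clustering) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: each series is assumed to be univariate, its finite-dimensional distribution is approximated by overlapping lag windows of length `dim`, the energy distance is estimated with a simple V-statistic, and average linkage is an arbitrary choice. The names `lag_embed`, `energy_distance`, and `cluster_series` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import cdist, squareform


def lag_embed(series, dim):
    """Stack overlapping windows (x_t, ..., x_{t+dim-1}) as rows."""
    x = np.asarray(series, dtype=float)
    return np.lib.stride_tricks.sliding_window_view(x, dim)


def energy_distance(a, b):
    """V-statistic estimate of 2E|A-B| - E|A-A'| - E|B-B'| (Euclidean norm)."""
    return 2.0 * cdist(a, b).mean() - cdist(a, a).mean() - cdist(b, b).mean()


def cluster_series(series_list, dim=2, n_clusters=2):
    """Build the pairwise dissimilarity matrix, then cut an average-linkage tree."""
    embedded = [lag_embed(s, dim) for s in series_list]
    k = len(embedded)
    dissim = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            # Clip at zero to guard against tiny negative rounding errors.
            d = max(energy_distance(embedded[i], embedded[j]), 0.0)
            dissim[i, j] = dissim[j, i] = d
    tree = linkage(squareform(dissim, checks=False), method="average")
    return dissim, fcluster(tree, t=n_clusters, criterion="maxclust")
```

The linkage matrix `tree` inside `cluster_series` is what `scipy.cluster.hierarchy.dendrogram` would consume to draw the dendrogram mentioned in the abstract; the sketch returns flat cluster labels instead to keep the example short.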
Heavy-tailed distributions, correlations, kurtosis and Taylor’s Law of fluctuation scaling

https://doi.org/10.1098/rspa.2020.0610

Cohen, Joel E.; Davis, Richard A.; Samorodnitsky, Gennady (December 2020, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences)
Pillai & Meng (Pillai & Meng 2016 Ann. Stat. 44, 2089–2097; p. 2091) speculated that ‘the dependence among [random variables, rvs] can be overwhelmed by the heaviness of their marginal tails ...’. We give examples of statistical models that support this speculation. While under natural conditions the sample correlation of regularly varying (RV) rvs converges to a generally random limit, this limit is zero when the rvs are the reciprocals of powers greater than one of arbitrarily (but imperfectly) positively or negatively correlated normals. Surprisingly, the sample correlation of these RV rvs multiplied by the sample size has a limiting distribution on the negative half-line. We show that the asymptotic scaling of Taylor’s Law (a power-law variance function) for RV rvs is, up to a constant, the same for independent and identically distributed observations as for reciprocals of powers greater than one of arbitrarily (but imperfectly) positively correlated normals, whether those powers are the same or different. The correlations and heterogeneity do not affect the asymptotic scaling. We analyse the sample kurtosis of heavy-tailed data similarly. We show that the least-squares estimator of the slope in a linear model with heavy-tailed predictor and noise unexpectedly converges much faster than when they have finite variances.
Full Text Available
